- CHAPTER 11 - ADDRESSING MODES AND POINTERS
-
-
- In this chapter we are going to cover all possible ways of
- getting data to and from memory with the different addressing
- modes. Read this carefully, since it is likely this is the only
- time you will ever see ALL addressing possibilities covered.
-
- Moving data is easiest when the data has a name and is one or two
- bytes long. Take the following data:
-
- ; -----
- variable1 dw 2000
- variable2 db -26
- variable3 dw -589
- ; -----
-
- We can write:
-
- mov variable1, ax
- mov cl, variable2
- mov si, variable3
-
- and the assembler will write the appropriate machine code for
- moving the data. What can we do if the data is more than two
- bytes long? Here is some more data:
-
- ; -----
- variable4 db "This is a string of ascii data."
- variable5 dd -291578
- variable6 dw 600 dup (-11000)
- ; -----
-
- Variable4 is the address of the first byte of a string of ASCII
- data. Variable5 is a single piece of data, but it won't fit into
- an 8086 register since it is 4 bytes long. Variable6 is a 600
- element long array, with each element having the value -11000. In
- order to deal with these, we need pointers.
-
- Some of you will be flummoxed at this point, while those who are
- used to the C language will feel right at home. A pointer is
- simply the address of a variable. We use one of the 8086
- registers to hold the address of a variable, and then tell the
- 8086 that the register contains the address of the variable, not
- the variable itself. It "points" to a place in memory to send the
- data to or retrieve the data from. If this seems a little
- confusing, don't worry; you'll get the hang of it quickly.
-
- As I have said before, the 8086 does not have general purpose
- registers. Many instructions (such as LOOP, MUL, IDIV, ROL) work
- only with specific registers. The same is true of pointers. You
- may use only BX, SI, DI, and BP as pointers. The assembler will
- give you an error if you try using a different register as a
- pointer.
-
- ______________________
-
- The PC Assembler Tutor - Copyright (C) 1989 Chuck Nelson
-
- There are two ways to put an address in a pointer. For variable4,
- we could write either:
-
- lea si, variable4
-
- or:
-
- mov si, offset variable4
-
- Both instructions will put the offset address of variable4 in
- SI.{1} SI now 'points' to the first byte (the letter 'T') of
- variable4. If we wanted to move the third byte of that array
- (the letter 'i') to CL, how would we do it? First, we need to
- have SI point to the third byte, not the first. That's easy:
-
- add si, 2
-
- But if we now write:
-
- mov cl, si
-
- we will generate an assembler error because the assembler will
- think that we want to move the data in SI (a two byte number) to
- CL (one byte). How do we tell the assembler that we are using SI
- as a pointer? By enclosing SI in square brackets:
-
- mov cl, [si]
-
- Since CL is one byte, the assembler assumes you want to move one
- byte. If you write:
-
- mov cx, [si]
-
- then the assembler assumes that you want to move a word (two
- bytes). The whole thing now is:
-
- lea si, variable4
- add si, 2
- mov cl, [si]
-
- This puts the third byte of the string in CL. Remember, if a
- register is in square brackets, then it is holding the ADDRESS of
- a variable, and the 8086 will use the register to calculate where
- the data is in memory.
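-
- The same bracket notation also reaches variable5, which is too
- big for a single register. What follows is only a sketch, not
- code from the tutorial: it assumes DS already addresses the data
- segment, and the choice of DX:AX to hold the result is made here
- purely for illustration. The 8086 stores the low word of a
- doubleword at the lower address, so two word moves fetch it:
-
```asm
        lea     bx, variable5       ; BX points to the low word
        mov     ax, [bx]            ; AX = low word  (8D06h for -291578)
        add     bx, 2               ; step the pointer one word forward
        mov     dx, [bx]            ; DX = high word (0FFFBh for -291578)
```
-
- DX:AX then holds 0FFFB8D06h, the 32-bit form of -291578.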
-
- ____________________
-
- 1 LEA stands for load effective address. Note that with LEA,
- we use only the name of the variable, while with:
-
- mov si, offset variable4
-
- we need to use the word 'offset'. The exact difference between
- the two will be explained later.
-
- What if we want to put 0s in all the elements of variable6?
- Here's the code:
-
-     mov bx, offset variable6
-     mov ax, 0
-     mov cx, 600
- zero_loop:
-     mov [bx], ax
-     add bx, 2
-     loop zero_loop
-
- We add 2 to BX each time since each element of variable6 is a
- word (two bytes) long. There is another way of writing this:
-
-     mov bx, offset variable6
-     mov cx, 600
- zero_loop:
-     mov [bx], 0
-     add bx, 2
-     loop zero_loop
-
- Unfortunately, this will generate an assembler error. Why? If the
- assembler sees:
-
- mov [bx], ax
-
- it knows that you want to move what is in AX to the address in
- BX, and AX is one word (two bytes) long so it generates the
- machine code for a word move. If the assembler sees:
-
- mov [bx], al
-
- it knows that you want to move what is in AL to the address in
- BX, and AL is one byte long, so it generates the machine code for
- a byte move. If the assembler sees:
-
- mov [bx], 0
-
- it doesn't know whether you want a byte move or a word move. The
- 8086 assembler has implicit sizing. It is the assembler's job to
- look at each instruction and decide whether you want to operate
- on a byte or a word. Other microprocessors do things differently.
- On the Motorola 68000, the assembler uses explicit sizing. Each
- instruction must explicitly state whether it is a byte or a
- word.{2} On the 68000 you have:
-
- move.b #213, (A1)
- move.w #213, (A1)
-
- The first instruction says to move a byte (the number 213) to the
- address in register A1 while the second instruction says to move
- a word (the number 213) to the address in register A1.{3}
- ____________________
-
- 2 Any of you who use the 68000 assembler know that this is
- fudging the facts a little bit.
-
- Back to the 8086. If the 8086 assembler looks at an instruction
- and it can't tell whether you want to move a byte or a word, it
- generates an error. When you use pointers with constants, you
- should explicitly state whether you want a byte or a word. The
- proper way to do this is to use the reserved words BYTE PTR or
- WORD PTR.
-
- mov [bx], BYTE PTR 213
- mov [bx], WORD PTR 213
-
- These stand for byte pointer and word pointer respectively. I
- find this terminology exceptionally clumsy, but that's life.
- Whenever you are moving a constant with a pointer, you should
- specify either BYTE PTR or WORD PTR.
-
- The Microsoft assembler makes some assumptions about the size of
- a constant. If the number is 256 or below (either positive or
- negative), you MUST explicitly state whether it is a byte or a
- word operation. If the number is 257 or above (either positive or
- negative), the assembler assumes that you want a word operation.
-
- Here's the previous code rewritten correctly:
-
-
-     mov bx, offset variable6
-     mov cx, 600
- zero_loop:
-     mov [bx], WORD PTR 0
-     add bx, 2
-     loop zero_loop
-
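- The size override is more often seen attached to the memory
- operand than to the constant. As a sketch of that spelling (the
- label zero_loop2 is made up here to avoid clashing with the loop
- above), the same loop can be written:
-
```asm
        mov     bx, offset variable6
        mov     cx, 600
zero_loop2:
        mov     WORD PTR [bx], 0    ; word-sized store of the constant 0
        add     bx, 2               ; next word element
        loop    zero_loop2
```
-
- Both spellings ask for the same word-sized store; which ones a
- given assembler accepts is worth checking in its manual.
-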
- Let's add 435 to every element in the variable6 array:
-
-     mov bx, offset variable6
-     mov cx, 600
- add_loop:
-     add [bx], WORD PTR 435
-     add bx, 2
-     loop add_loop
-
- How about multiplying every element in the array by 12?
-
-     mov di, offset variable6
-     mov cx, 600
-     mov si, 12
- mult_loop:
-     mov ax, [di]
-     imul si
-     mov [di], ax
-     add di, 2
-     loop mult_loop
-
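- The loop above keeps only the low word of each product. For
- readers who want to catch a product that no longer fits in a
- word, the following is one possible sketch, not part of the
- tutorial's code; the labels checked_loop, too_big and
- checked_done are hypothetical. It relies on IMUL setting the
- overflow flag when DX:AX is not just the sign extension of AX:
-
```asm
        mov     di, offset variable6
        mov     cx, 600
        mov     si, 12
checked_loop:
        mov     ax, [di]
        imul    si                  ; DX:AX = AX * SI (signed)
        jo      too_big             ; OF set: product does not fit in AX
        mov     [di], ax
        add     di, 2
        loop    checked_loop
        jmp     checked_done
too_big:
        ; DI points at the element whose product overflowed;
        ; how to report or recover is left to the program
checked_done:
```
-
- With the sample data this exit is taken at once, since
- -11000 * 12 = -132000 is outside the range of a signed word.
-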
- ____________________
-
- 3 A1 is a 68000 register.
-
- None of these examples did any error checking, so if the result
- was too large, the overflow was ignored. This time we used DI for
- a change of pace. Remember, we may use BX, SI, DI or BP, but no
- others. You will notice that in all these examples, we started at
- the beginning of the array and went step by step through the
- array. That's fine, and that's what we normally would do, but
- what if we wanted to look at individual elements? Here's a sample
- program:
-
- ; + + + + + START DATA BELOW THIS LINE
- ;
- poem_array db "She walks in Beauty, like the night"
- db "Of cloudless climes and starry skies;"
- db "And all that's best of dark and bright"
- db "Meet in the aspect ratio of 1 to 3.14159"
- character_count db 149
- ; + + + + + END DATA ABOVE THIS LINE
-
- ; + + + + + START CODE BELOW THIS LINE
-
-     mov bx, offset poem_array
-     mov dl, character_count
-
- character_loop:
-     sub ax, ax              ; clear ax
-     call get_unsigned_byte
-     dec al                  ; character #1 = array[0]
-     cmp al, dl              ; out of range?
-     ja character_loop       ; then try again
-     mov si, ax              ; move char # to pointer register
-     mov al, [bx+si]         ; character to al
-     call print_ascii_byte
-     jmp character_loop
-
- ; + + + + + END CODE ABOVE THIS LINE
-
- You enter a number and the program prints the corresponding
- character. Before starting, we put the array address in BX and
- the maximum character index in DL. After getting the number from
- get_unsigned_byte, we decrement AL since the first character is
- actually poem_array[0]. The poem is 150 characters long, so
- character_count holds 149, the index of the last character.
- Decrementing also makes 0 an illegal entry, since 0 - 1 wraps
- around to 255. Notice that the program checks to make sure you
- don't go past the end of the poem. This time we use BX to mark
- the beginning of the array and SI to count the number of the
- character.
-
- Once again, there are only specific combinations of pointers that
- can be used. They are:
-
- BX with either SI or DI (but not both)
- BP with either SI or DI (but not both)
-
- My version of the Microsoft assembler (v5.1) recognizes the forms
- [bx+si], [si+bx], [bx][si], [si][bx], [si]+[bx] and [bx]+[si] as
- the same thing and produces the same machine code for all six.
-
- We can get even more complicated, but to show that, we need
- structures. In databases they are called records. In C they are
- called structures; in any case they are the same thing - a group
- of different types of data in some standard order. After the
- group is defined, we usually make an array with the identical
- structure for each element of the array.{4} Let's make a
- structure for an address book.
-
- last_name db 15 dup (?)
- first_name db 15 dup (?)
- age db ?
- tel_no db 10 dup (?)
-
- In this case, all the data is bytes, but that is not necessary.
- It can be anything. Each separate piece of data is called a
- FIELD. We have the last_name field, the first_name field, the age
- field, and the tel_no field. Four fields in all. The structure is
- 41 bytes long. What if we want to have a list of 100 names in our
- telephone book? We can allocate memory space with the following
- definition:
-
- address_book db 100 dup ( 41 dup (' ')) {5}
-
- Well, that allocates room in memory, but how do we get to
- anything? First, we need the array itself:
-
- mov bx, offset address_book
-
- Then we need one specific entry. Let's take entry 29 (which is
- address_book[28]). Each entry is 41 bytes long, so:
-
- mov ax, 28 ; entry (less 1)
- mov cx, 41 ; entry length
- mul cx
- mov di, ax ; move to pointer
-
- That gives us the entry, but if we want to get the age, that's
- not the first byte of the structure, it's the 31st byte (actually
- address_book[28] + 30 since the first byte is at +0). We get it
- by writing:
-
- mov dl, [bx+di+30]
-
- This is the most complex thing we have - two pointers plus a
- constant. The total code is then:
-
- mov bx, offset address_book
- mov ax, 28           ; entry (less 1)
- mov cx, 41           ; entry length
- mul cx               ; entry offset from array[0]
- mov di, ax           ; move entry offset to pointer
- mov dl, [bx+di+30]   ; total address
- ____________________
-
- 4 If you don't know about structures or records, now would be
- a good time to stop and go to a reference book about them. They
- are not actually covered here.
-
- 5 Nesting of dup statements is allowed. Rather than having
- uninitialized data, this has blanks in all the spaces.
-
- Though the machine code has only one constant in the code, the
- assembler will allow you to put a number of constants in the
- assembler instruction. It will add them together for you and
- resolve them into one number.{6}
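-
- As a rough illustration of that arithmetic, worked out by hand
- rather than taken from an assembler listing, the expression in
- footnote 6 should reduce to a single constant of 3914 (assuming
- the assembler's / is integer division, so 44/8 = 5). The second
- pair applies the same idea to address_book when the entry number
- is known at assembly time:
-
```asm
        ; as written in footnote 6, then with the constants folded by hand:
        add     ax, (-3*81)+44/8+[si+27]+6+[bx]-7+(43*96)-2
        add     ax, [bx+si+3914]    ; -243 + 5 + 27 + 6 - 7 + 4128 - 2 = 3914

        ; age field of entry 29, entry number fixed at assembly time:
        mov     dl, address_book + (28*41) + 30
        mov     dl, address_book + 1178     ; 28*41 = 1148, plus 30
```
-
- Each pair is two spellings of one instruction, shown side by side
- for comparison; a real program would use only one of them.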
-
- Once again, there are a limited number of registers - they are
- the same registers as before:
-
- BX with either SI or DI (but not both) plus constant
- BP with either SI or DI (but not both) plus constant
-
- We can work with structures on the machine level, but it looks
- like it's going to be hard to keep track of where each field is.
- Actually, it isn't so bad because of:
-
- OUR FRIEND, THE EQU STATEMENT
-
- The assembler allows you to do substitution. If you write:
-
- somestuff EQU 37 * 44
-
- then every place that the assembler finds the word "somestuff",
- it will substitute what is on the right side of the EQU. Is that
- a number or text? Sometimes it's a number, sometimes it's text.
- Here are four statements which are defined totally in terms of
- numbers. This is from the assembler listing. (The assembler lists
- how it has evaluated the EQU statement on the left after the
- equal sign.)
-
- = 0023                        statement1 EQU 5 * 7
- = 0025                        statement2 EQU statement1 + 6 - 4
- = 000F                        statement3 EQU statement2 - 22
- = 001F                        statement4 EQU statement3 + 16
-
- and the assembler thinks of these as numbers (these numbers are
- in hex). Now in the next set, with only a minor change:
-
- = [bp + 3]                    statement1 EQU [bp + 3]
- = [bp + 3] + 6 - 4            statement2 EQU statement1 + 6 - 4
- = [bp + 3] + 6 - 4 - 22       statement3 EQU statement2 - 22
- = [bp + 3] + 6 - 4 - 22 + 16  statement4 EQU statement3 + 16
-
- the assembler thinks of it as text. Obviously, the fact that it
- can be either may cause you some problems along the way. Consult
- the assembler manual for ways to avoid the problem.
- ____________________
-
- 6 And it does it quite well. The assembler correctly evaluated
- the following:
-
- add ax, (-3*81)+44/8+[si+27]+6+[bx]-7+(43*96)-2
-
- Not bad, huh?
-
-
- Now we have a tool to deal with structures. Let's look at that
- structure again.
-
- last_name db 15 dup (?)
- first_name db 15 dup (?)
- age db ?
- tel_no db 10 dup (?)
-
- We don't actually need a data definition to make the structure,
- we need equates:
-
- LAST_NAME EQU 0
- FIRST_NAME EQU 15
- AGE EQU 30
- TEL_NO EQU 31
-
- This gives us the offset of each field from the beginning of the
- record. If we again define:
-
- address_book db 100 dup ( 41 dup (' '))
-
- then to get the age field of entry 87, we write:
-
- mov bx, offset address_book
- mov ax, 86 ; entry (less 1)
- mov cx, 41 ; entry length
- mul cx ; entry offset from array[0]
- mov di, ax ; move entry offset to pointer
- mov dl, [bx+di+AGE] ; total address
-
- This is a lot of work for the 8086, but that is normal with
- complex structures. The only thing that takes a lot of time is
- the multiplication, but if you need it, you need it.{7}
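-
- The same equates reach any other field. As a small sketch along
- the same lines, here is the telephone number of entry 87. The name
- RECORD_LENGTH is a hypothetical equate introduced only for this
- illustration; the other names are the ones defined above:
-
```asm
RECORD_LENGTH   EQU     41              ; hypothetical: bytes per entry

        mov     bx, offset address_book
        mov     ax, 86                  ; entry (less 1)
        mov     cx, RECORD_LENGTH       ; entry length
        mul     cx                      ; entry offset from array[0]
        mov     di, ax                  ; move entry offset to pointer
        mov     dl, [bx+di+TEL_NO]      ; first byte of the telephone number
        mov     dh, [bx+di+TEL_NO+9]    ; tenth and last byte of the number
```
-
- If a field is later added or resized, only the equates change;
- the instructions that use them stay as they are.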
-
- ____________________
-
- 7 You will see more of the EQU statement.
-
- How about a two-dimensional array of integers, 40 rows by 60
- columns?
-
- int_array dw 40 dup ( 60 dup ( 0 ))
-
- These are initialized to 0. For our purposes, we'll assume that
- the first number is the row number and the second number is the
- column number; i.e. array [6,13] is row 6, column 13. We will
- have 40 rows of 60 columns. For ease of calculation, the first
- array element is int_array [0,0]. (If it is your array, you can
- set it up any way you want {8}). Each row is 60 words (120 bytes)
- long. To get to int_array [23, 45] we have:
-
- mov ax, 120                ; length of one row in bytes
- mov cx, 23                 ; row number
- mul cx                     ; AX = row offset in bytes
- mov bx, ax                 ; row offset to bx
- add bx, offset int_array   ; add the address of the array itself
- mov si, 45                 ; column number
- sal si, 1                  ; multiply column number by 2 (for word size)
- mov dx, [bx+si]            ; integer to dx
-
- Using SAL instead of MUL is about 50 times faster. Since most
- arrays you will be working with are either byte, word, or double
- word (4 bytes) arrays, you can save a lot of time. Let
- ELEMENT_NUMBER be the array number (starting at 0) of the desired
- element in a one-dimensional array. For byte arrays, no
- multiplication is needed. For a word:
-
- mov di, ELEMENT_NUMBER
- sal di,1 ; multiply by 2
-
- and for a double word (4 bytes):
-
- mov di, ELEMENT_NUMBER
- sal di, 1
- sal di, 1 ; multiply by 4
-
- This means that a one-dimensional array can be accessed very
- quickly as long as the element length is a power of 2 - either 2,
- 4 or 8. Since the standard 8086 data types are all 1, 2, 4, or 8
- bytes long, one dimensional arrays are fast. Others are not so
- fast.
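-
- The text mentions 8-byte elements without showing them. Following
- the same pattern, a sketch for that case needs three shifts. It
- assumes, for illustration only, that BX already holds the offset
- of the array:
-
```asm
        mov     di, ELEMENT_NUMBER
        sal     di, 1
        sal     di, 1
        sal     di, 1               ; multiply by 8
        mov     ax, [bx+di]         ; low word of the 8-byte element
```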
-
- ____________________
-
- 8 Bear in mind that all compiled languages have fixed
- formats for arrays. If you want your array to interact with C,
- Fortran, Pascal or Basic, you'd better be sure you have the right
- format.
-
- As a quick review before going on, these are the legal ways to
- address a variable on the 8086:
-
- (1) by name.
-
- mov dx, variable1
-
- It is also possible to have name + constant.
-
- mov dx, variable1 + 27
-
- The assembler will resolve this into a single offset number
- and will give the appropriate information to the linker.
-
- (2) with the single pointers BX, SI, DI and BP (which are
- enclosed in square brackets).
-
- mov cx, [si]
- xor al, [bx]
- add [di], cx
- sub [bp], dh
-
- (3) with the single pointers BX, SI, DI and BP (which are
- enclosed in square brackets) plus a constant.
-
- mov cx, [si+421]
- xor al, 18+[bx]
- add 93+[di]-7, cx
- sub (54/7)+81-3+[bp]-19, dh
-
- (4) with the double pointers [bx+si], [bx+di], [bp+si],
- [bp+di] (which are enclosed in square brackets).
-
- mov cx, [bx][si]
- xor al, [di][bx]
- add [bp]+[di], cx
- sub [di+bp], dh
-
- (5) with the double pointers [bx+si], [bx+di], [bp+si],
- [bp+di] (which are enclosed in square brackets) plus a
- constant.
-
- mov cx, [bx][si+57]
- xor al, 45+[di+23][bx+15]-94
- add [bp]+[di]-444, cx
- sub [6+di+bp]-5, dh
-
- These are ALL the addressing modes allowed on the 8086. As for
- the constants, it is the ASSEMBLER'S job to resolve all numbers
- in the expression into a single constant. If your expression
- won't resolve into a constant, it is between you and the
- assembler. It has nothing to do with the 8086 chip.
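-
- To make the assembler's part concrete, here are some of the
- examples from (3) and (5) next to the single-constant form each
- should resolve to. The right-hand column was worked out by hand,
- assuming / is integer division (54/7 = 7); it is not copied from
- an assembler listing:
-
```asm
        xor     al, 18+[bx]                 ; xor al, [bx+18]
        add     93+[di]-7, cx               ; add [di+86], cx
        sub     (54/7)+81-3+[bp]-19, dh     ; sub [bp+66], dh
        xor     al, 45+[di+23][bx+15]-94    ; xor al, [bx+di-11]
        sub     [6+di+bp]-5, dh             ; sub [bp+di+1], dh
```
-
- In every case the 8086 itself sees one or two pointer registers
- and at most one constant.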
-
-